Pragmatic Multi-Agent Learning
Abstract
Early models of procedural learning assumed actors were isolated, model-based thinkers. More recently, learning techniques have become more sophisticated as this assumption has been replaced with more realistic ones. To date, however, there has been no thorough investigation of multiple, heterogeneous, situated agents who learn from the pragmatics of their domain rather than from a model. This research focuses on this important problem and develops learning techniques that allow agents to improve their performance in a dynamic environment by learning from past run-time behavior. Humans provide a natural model of pragmatic agents situated in a multi-agent world. [1] argues that the development of distributed cooperative behavior in people is shaped by the accumulated cultural-historical knowledge of the community. Our learning techniques are motivated by this argument and use a structure called collective memory to store the accumulated procedural knowledge of a community of agents. Collective memory contains the breadth of knowledge the community acquires as its members interact with each other and the world while solving sequences of distinct problems. The cornerstone of collective memory is a case base of cooperative procedures that augments the agents’ first-order planner [2]; in other words, this work follows in the tradition of second-order planners, extending them into multi-agent domains. In our model of activity, each agent has her own point of view on how best to proceed, which often leads to uncoordinated and unproductive behavior. Furthermore, inefficient behavior would occur even if there were a consensus on the best course of action to follow (perhaps legislated by a supervising agent or agreed to during community-wide communication) because of the community’s initial lack of knowledge about its uncertain domain.
Through the use of collective memory, however, agents behave more efficiently over the course of solving a problem sequence for two reasons. First, individual agents develop a point of view based upon shared experiences; second, they learn procedures that capture regularities both in the task environment and in the patterns of cooperation for solving problems in the task domain. That is, an agent remembers successful cooperative behavior in which she was involved, and uses it as a basis for future interactions.
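The idea of remembering successful cooperative behavior and reusing it can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's design: the names `Case`, `CollectiveMemory`, and `plan` are hypothetical, cases are indexed by a plain set of problem features, and similarity is a simple feature-overlap count. The first-order planner is passed in as an ordinary function and is used only when no remembered case applies.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Case:
    """One remembered cooperative procedure (hypothetical structure)."""
    features: frozenset      # features of the problem that was solved
    participants: frozenset  # agents that took part in the procedure
    steps: tuple             # the procedure that succeeded


@dataclass
class CollectiveMemory:
    """Assumed case base shared by a community of agents."""
    cases: list = field(default_factory=list)

    def remember(self, agent, case):
        # An agent stores only successful procedures she was involved in.
        if agent in case.participants:
            self.cases.append(case)

    def recall(self, agent, features, min_overlap=1):
        # Return the case involving this agent that shares the most
        # features with the new problem, or None if none overlaps enough.
        best, best_score = None, min_overlap - 1
        for case in self.cases:
            if agent not in case.participants:
                continue
            score = len(case.features & features)
            if score > best_score:
                best, best_score = case, score
        return best


def plan(agent, features, memory, first_order_planner):
    # Reuse a remembered procedure when one applies; otherwise fall back
    # to planning from scratch with the first-order planner.
    case = memory.recall(agent, features)
    if case is not None:
        return case.steps
    return first_order_planner(features)
```

In this sketch, two agents who took part in the same successful procedure retrieve the same case for a similar problem, which is one simple way a shared point of view could arise from shared experience.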
Similar resources
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
Improving Agent Performance for Multi-Resource Negotiation Using Learning Automata and Case-Based Reasoning
In electronic commerce markets, agents often should acquire multiple resources to fulfil a high-level task. In order to attain such resources they need to compete with each other. In multi-agent environments, in which competition is involved, negotiation would be an interaction between agents in order to reach an agreement on resource allocation and to be coordinated with each other. In recent ...
Learning with neighbours: Emergence of convention in a society of learning agents
I present a game-theoretical multi-agent system to simulate the evolutionary process responsible for the pragmatic phenomenon division of pragmatic labour (DOPL), a linguistic convention emerging from evolutionary forces. Each agent is positioned on a toroid lattice and communicates via signaling games, where the choice of an interlocutor depends on the Manhattan distance between them. In this ...
An Online Q-learning Based Multi-Agent LFC for a Multi-Area Multi-Source Power System Including Distributed Energy Resources
This paper presents an online two-stage Q-learning based multi-agent (MA) controller for load frequency control (LFC) in an interconnected multi-area multi-source power system integrated with distributed energy resources (DERs). The proposed control strategy consists of two stages. The first stage employs a PID controller whose parameters are designed using sine cosine optimization (SCO...
Voltage Coordination of FACTS Devices in Power Systems Using RL-Based Multi-Agent Systems
This paper describes how multi-agent system technology can be used as the underpinning platform for voltage control in power systems. In this study, some FACTS (flexible AC transmission systems) devices are properly designed to coordinate their decisions and actions in order to provide a coordinated secondary voltage control mechanism based on multi-agent theory. Each device here is modeled as ...
Publication date: 1998